Abstract —Most real-world networks exhibit community structure,
a phenomenon characterized by the existence of node clusters
whose internal edge connectivity is stronger than the connectivity
between nodes belonging to different clusters. In addition to
facilitating a better understanding of network behavior, community
detection finds many practical applications in diverse
settings. Communities in online social networks are indicative of
shared functional roles, or affiliation to a common socio-economic
status, the knowledge of which is vital for targeted advertisement.
In buyer-seller networks, community detection facilitates better
product recommendations. Unfortunately, the reliability of community
assignments is hindered by anomalous user behavior, often
observed as unfair self-promotion, or “fake” highly-connected
accounts created to promote fraud. The present paper advocates
a novel approach for jointly tracking communities while detecting
such anomalous nodes in time-varying networks. By postulating
edge creation as the result of mutual community participation by
node pairs, a dynamic factor model with anomalous memberships
captured through a sparse outlier matrix is put forth. Efficient
tracking algorithms suitable for both online and decentralized
operation are developed. Experiments conducted on both synthetic
and real network time series successfully unveil underlying
communities and anomalous nodes.

Index Terms —Community detection, anomalies, non-negative 
matrix factorization, low rank, sparsity. 

I. Introduction 

Networks underlie many complex phenomena involving 
pairwise interactions between entities [8], [17]. Examples 
include online social networks such as Facebook or Twitter, 
e-mail and phone correspondences among individuals, the 
Internet, and electric power grids. Most network analyses focus 
on static networks, with node and link structures assumed 
fixed. However, real-world networks often evolve over time,
e.g., new links are frequently added to the web. Incorporating
such temporal dynamics plays a fundamental role towards a
better understanding of network behavior.

Community identification is one of the most studied tasks in
modern network analysis [9], [12]. Fundamentally, communities
pertain to the inherent grouping of nodes, with many edges
connecting nodes belonging to the same cluster, and far fewer
edges existing between clusters. Cognizant of the temporal behavior
inherent to networks, several recent works have focused
on the task of tracking time-varying communities [13], [18],
[20], [28], [29]. Identification of dynamic communities finds
applications in many settings, e.g., grouping subscribers of an
online social network into functional roles for more informative
advertising, or clustering blogs into content groups, facilitating
improved recommendations to readers.

Classical community detection approaches predominantly
resort to well-studied unsupervised machine learning algorithms,
e.g., hierarchical and spectral clustering [14], [16], [19].
To facilitate interpretability, several authors have postulated
that a network exhibits community structure if it contains
subnetworks whose expected node degree exceeds that of a
random graph; see e.g., modularity [22]. Many of these methods
conduct hard community assignment, whereby no node
can jointly belong to more than one community. Nevertheless,
communities in real networks tend to overlap with, or even
completely contain others, e.g., a Facebook user may jointly
belong to a circle of college friends, and another comprising
workplace colleagues. The quest to unveil possibly overlapping
communities has been at the forefront of efforts to develop
more flexible community discovery algorithms, capable of
associating each node with a per-community affiliation strength,
a.k.a. soft clustering. Among these, factor models, e.g., non-negative
matrix factorization (NMF), have recently become
popular for overlapping community discovery [20], [24], [30].

This paper builds upon recent advances in overlapping
community identification, with focus on dynamic networks. It
is assumed that temporal variations are slow across observation
time intervals. In addition, special attention is paid to the existence
of aberrant nodes exhibiting “anomalous” behavior. Such behavior
may often manifest as unusually strong, uni-directional
edge connectivity across communities, leading to distortion
of true communities; see Figure 1. Examples include e-mail
spammers, or individuals with malicious intent, masquerading
under “false” Facebook profiles to connect with as
many legitimate users as possible. Anomaly identification
facilitates discovery of more realistic communities. The present
paper develops algorithms for jointly tracking time-varying
communities, while compensating for anomalies.

Several prior works have studied the evolution of general 
temporal behavior in time-varying networks; see e.g., [1], [2], 
[7], [10], [23], [27]. Using tensor and matrix factorizations, 
temporal link prediction approaches are advocated for bipartite 
graphs in [7]. Postulating a state-space dynamic stochastic 
blockmodel for dynamic edge evolution, an extended Kalman 
filter was proposed for tracking communities in [29]. An 
NMF model was advocated for batch recovery of overlapping 
communities in time-varying networks in [20]. Generally, these
contemporary approaches do not account for the occurrence of
distortive anomalies, which may hurt the accuracy of community
assignments.

Fig. 1: An illustration of the distortive effect of anomalous
nodes: a) A 16-node directed network with four easily
discernible communities; and b) The same network with node
10 exhibiting an unusually high number of outgoing edges.
Identifying the underlying communities is more challenging
as a result of the distortion caused by node 10.

The fresh look advocated by the present paper jointly
accounts for temporal variations and anomalous affiliations.
Motivated by contemporary NMF approaches, a community
affiliation model in which edge weights are generated by
mutual community participation between node pairs is adopted
in Section II. The proposed approach capitalizes on sparsity
of anomalies, rank deficiency inherent to networks with far
fewer communities than the network size, as well as slow
edge variation across time intervals. Under these conditions,
a sparsity-promoting, rank-regularized, exponentially-weighted
least-squares estimator is put forth in Section III. Leveraging
advances in proximal splitting approaches (see e.g., [4]),
computationally efficient community tracking algorithms based on
alternating minimization are developed.

In order to appeal to big data contexts, within which most
social networks of interest arise, a number of algorithmic
modifications are considered. Towards facilitating real-time,
memory-efficient operation in streaming data settings, an online
algorithm leveraging stochastic gradient descent iterations
is developed in Section IV-B. Moreover, certain practical
settings entail storage of network data across many clusters of
computing nodes, possibly located at geographically different
sites. In such scenarios, tracking communities in a decentralized
fashion is well-motivated. To this end, a tracking algorithm
that leverages the alternating direction method of multipliers
(ADMM) is developed in Section V. Numerical tests with
synthetically generated networks demonstrate the effectiveness
of the developed algorithms in tracking communities and
anomalies (Section VI-A). Further experiments in Section VI-B
are conducted on real data, extracted from global trade flows
among nations between 1870 and 2009.

Notation. Bold uppercase (lowercase) letters will denote matrices
(column vectors), while operators $(\cdot)^\top$, $\lambda_{\max}(\cdot)$, and
$\mathrm{diag}(\cdot)$ will stand for matrix transposition, maximum eigenvalue,
and diagonal matrix, respectively. The identity matrix
will be represented by $\mathbf{I}$, while $\mathbf{0}$ denotes the matrix of all
zeros. The $\ell_p$ and Frobenius norms will be denoted by $\|\cdot\|_p$
and $\|\cdot\|_F$, respectively. The indicator function $\mathbb{1}_{\{x\}} = 1$ if
$x$ evaluates to “true”, otherwise $\mathbb{1}_{\{x\}} = 0$. Finally, $[\mathbf{X}]_+$ will
denote projection of $\mathbf{X}$ onto the non-negative orthant, with the
$(i,j)$-th entry $[[\mathbf{X}]_+]_{ij} = \max([\mathbf{X}]_{ij}, 0)$.


II. Model and Problem Statement 
A. Community affiliation model 

Consider a dynamic directed $N$-node network whose time-varying
topology is captured by the time series of adjacency
matrices $\{\mathbf{A}^t \in \mathbb{R}^{N \times N}\}_{t=1}^{T}$. Entry $(i,j)$ of $\mathbf{A}^t$ (hereafter
denoted by $a_{ij}^t$) is nonzero only if an edge originating from
node $i$ is connected to node $j$ during interval $t$. It is assumed
that edge weights are nonnegative, namely $a_{ij}^t \geq 0$. Suppose
that the network consists of $C$ unknown communities which
are allowed to overlap, that is, a node can belong to one or
more communities simultaneously. This is motivated by practical
settings where, e.g., a Facebook user may be associated
with multiple communities consisting of her work colleagues,
former schoolmates, or friends from the local sports club. It can
be reasoned that the likelihood of two people becoming friends
is directly related to the number of communities to which they
mutually belong. For example, if two work colleagues happen
to have attended the same high school, then chances are high
that they will become friends. This a fortiori argument based
upon a reasonably natural observation in social settings lies at
the foundation of several recent community affiliation models
for edge generation [20], [30].

Suppose $\mathbf{V}^t := [\mathbf{v}_1^t \ldots \mathbf{v}_C^t] \in \mathbb{R}^{N \times C}$ denotes a temporal
basis matrix whose columns span a linear subspace of dimension
$C$ during observation interval $t$. Associating each basis
vector with one of $C$ communities, the edge vector associated
with each node $i$ can be expressed as a linear combination of
$\{\mathbf{v}_c^t\}_{c=1}^{C}$, i.e.,

$$\mathbf{a}_i^t = \mathbf{V}^t \mathbf{u}_i^t + \mathbf{e}_i^t, \quad i = 1, \ldots, N \tag{1}$$

where $(\mathbf{a}_i^t)^\top$ denotes row $i$ of $\mathbf{A}^t$, and $\mathbf{e}_i^t$ captures unmodeled
dynamics. Entries of $\mathbf{u}_i^t := [u_{i1}^t \ldots u_{iC}^t]^\top$ in (1) assign
community affiliation strengths, with $u_{ic}^t = 0$ only if node
$i$ does not belong to community $c$ during interval $t$. Since
$a_{ij}^t \geq 0$, entries of both $\mathbf{V}^t$ and $\mathbf{u}_i^t$ will be constrained to
nonnegative values. Collecting edge vectors for all nodes, (1)
can be expressed equivalently as the following canonical NMF
model

$$\mathbf{A}^t = \mathbf{U}^t (\mathbf{V}^t)^\top + \mathbf{E}^t \tag{2}$$

where $(\mathbf{U}^t)^\top := [\mathbf{u}_1^t \ldots \mathbf{u}_N^t]$, and $(\mathbf{E}^t)^\top := [\mathbf{e}_1^t \ldots \mathbf{e}_N^t]$. The
asymmetry inherent to (2) generalizes traditional approaches
(e.g., spectral clustering), rendering them capable of capturing
communities in weighted, directed, and even bipartite graphs.
The special case in which edges are undirected (i.e., $a_{ij}^t = a_{ji}^t$)
can be readily realized by setting $\mathbf{U}^t = \mathbf{V}^t$.
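As a rough illustration of the affiliation model (2), the following sketch builds a noise-free adjacency matrix from non-negative factors. The sizes `N` and `C`, the random factors, and the roughly 50% membership pattern are hypothetical choices made only for this example; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 12, 3  # hypothetical network size and number of communities

# Non-negative affiliation factors; a zero entry means "not a member".
U = rng.random((N, C)) * (rng.random((N, C)) < 0.5)
V = rng.random((N, C)) * (rng.random((N, C)) < 0.5)

# Model (2) without the noise term E: row i of A equals V @ u_i, as in (1).
A = U @ V.T

# Undirected special case: choosing U = V gives a symmetric adjacency matrix.
A_sym = U @ U.T
```

Under these assumptions `A` is entry-wise non-negative and has rank at most `C`, which is the low-rank structure the estimator in Section III exploits.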

Contemporary community detection approaches overwhelmingly
focus on unipartite networks, whereby edges are allowed
to exist between any pair of nodes. On the other hand, directional
edges in bipartite networks only connect nodes belonging
to two distinct classes. Examples include buyer-seller networks
existing in online applications like eBay, or recommender networks
with edges capturing ratings of products by customers.
With matrices $\mathbf{U}^t \in \mathbb{R}^{N \times C}$ and $\mathbf{V}^t \in \mathbb{R}^{M \times C}$, communities
in a bipartite network ($N$ nodes belonging to one
class, and $M$ to the other) can be readily captured through the
affiliation model (2).

Unfortunately, (1) does not effectively capture anomalous
nodes that exhibit unusually strong affiliation to one or more
communities. This aberrant behavior has been observed in
several real-world networks, and it may arise for any of
several reasons. For example, the presence of “fake” user accounts
in online social networks, created to phish
unsuspecting peers, may lead to abnormally high numbers
of outgoing links. In addition, fraudulent reviewers in web-based
rating applications may exhibit abnormal affiliation while
unfairly promoting their services to specific communities.
Regardless of the underlying reason, detection of such nodes
is envisioned as a source of strategic information for network
operators. Moreover, identification of anomalies is critical for
improved community detection accuracy; see Figure 1 for an
example where an anomalous node distorts the underlying
community structure.

B. Outlier-aware community affiliation model 

Suppose node $i$ is considered anomalous, exhibiting an
abnormally strong level of affiliation in one or more of the $C$
communities. In order to preserve the estimation accuracy of
community discovery algorithms, one is motivated to modify
(1) so that such outliers are accounted for. The present paper
postulates the following robust edge generation model

$$\mathbf{a}_i^t = \mathbf{V}^t (\mathbf{u}_i^t + \mathbf{o}_i^t) + \mathbf{e}_i^t, \quad i = 1, \ldots, N \tag{3}$$

where $\mathbf{o}_i^t := [o_{i1}^t \ldots o_{iC}^t]^\top$, and $o_{ic}^t \neq 0$ only if node $i$
exhibits anomalous affiliation in community $c$. Intuitively, (3)
suggests that one can investigate whether any node
is an anomaly or not by introducing more variables that
compensate for the effect of outliers on the edge generation
model. Since outliers are rare by definition, vectors $\{\mathbf{o}_i^t\}_{i=1}^{N}$
are generally sparse, and this prior knowledge can be exploited
to recover the unknowns.

Letting $(\mathbf{O}^t)^\top := [\mathbf{o}_1^t \ldots \mathbf{o}_N^t]$, the outlier-aware counterpart of the community
affiliation model in (2) can be written as

$$\mathbf{A}^t = (\mathbf{U}^t + \mathbf{O}^t)(\mathbf{V}^t)^\top + \mathbf{E}^t, \quad t = 1, 2, \ldots \tag{4}$$

where $\mathbf{O}^t$ is sparse. In static scenarios with $\mathbf{A} = (\mathbf{U} + \mathbf{O})\mathbf{V}^\top$,
setting $\mathbf{O} = \mathbf{0}$ yields the canonical NMF model for community
discovery. Given $\{\mathbf{A}^t\}_{t=1}^{T}$, the goal of this paper is to track the
community affiliation matrices $\{\mathbf{U}^t, \mathbf{V}^t\}_{t=1}^{T}$, as well as outliers
captured through matrices $\{\mathbf{O}^t\}_{t=1}^{T}$.

It is worth reiterating that (4) is a heavily under-determined
model, and the only hope to recover $\{\mathbf{U}^t, \mathbf{O}^t, \mathbf{V}^t\}_{t=1}^{T}$ lies
in exploiting prior information about the structure of the
unknowns. Indeed, the estimator advocated in the sequel will
capitalize on sparsity, low rank, and the slow evolution of
networks. Introducing extra variables to capture outliers has
been used in different contexts; see e.g., [11] and references
therein.
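To make the role of the sparse outlier matrix in (4) concrete, the toy sketch below mimics the setting of Figure 1: a 16-node network with four non-overlapping communities, in which one node is given spurious affiliation to the three communities it does not belong to. The hard memberships, the choice `V = U`, and the index of the anomalous node are illustrative assumptions only.

```python
import numpy as np

N, C = 16, 4
labels = np.repeat(np.arange(C), N // C)   # nodes 0-3 in community 0, etc.

U = np.zeros((N, C))
U[np.arange(N), labels] = 1.0              # hard memberships (toy assumption)
V = U.copy()

# Sparse outlier matrix: only one row is non-zero.
O = np.zeros((N, C))
anomalous = 9                              # "node 10" in 1-based numbering
O[anomalous, [0, 1, 3]] = 1.0              # spurious affiliation elsewhere

A_clean = U @ V.T                          # model (2), noise-free
A = (U + O) @ V.T                          # model (4), noise-free
```

In this construction only the row of the anomalous node changes: it now links to every node in the network, while all other rows keep their block structure.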

Remark 1 (Measurement outliers): Model (4) is motivated by
anomalous nodes, whose presence leads to distorted community
structures in e.g., social networks. A slight variation of this
problem arises in cases where one is interested in identifying
which edges are anomalous. This is well-motivated in settings
where edge weights are directly measured, and encode valuable
information, e.g., star ratings in online review systems. The
outliers in such cases are the result of faulty measurements,
bad data (e.g., skewed user ratings), or data corruption. To
detect anomalous edge weights, one can postulate that [cf. (4)]

$$\mathbf{A}^t = \mathbf{U}^t(\mathbf{V}^t)^\top + \mathbf{O}^t + \mathbf{E}^t, \quad t = 1, 2, \ldots \tag{5}$$

where $\mathbf{O}^t$ is sparse, and can be effectively recovered by leveraging
sparsity-promoting estimation approaches; see e.g., [11]
and references therein.

III. Community Tracking Algorithm 
This section assumes that the following hold: a1) $\mathbf{O}^t$ is
sparse; a2) $\mathbf{U}^t(\mathbf{V}^t)^\top$ is low rank; and a3) $\{\mathbf{A}^t\}_{t=1}^{T}$ evolve
slowly over time, that is, the matrices in the sequence $\{\mathbf{A}^t - \mathbf{A}^{t-1}\}_{t=2}^{T}$
are sparse. In order to justify a1, note that the
set of anomalous nodes is much smaller than that of “ordinary”
nodes. On the other hand, a2 results from requiring
that $\mathrm{rank}(\mathbf{U}^t(\mathbf{V}^t)^\top) \leq C \ll N$, while a3 is motivated by
observations of the evolution of most real-world networks. In
the sequel, a sequential estimator that exploits a1-a3 is put
forth.


A. Exponentially-weighted least-squares estimator 

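Suppose data are acquired sequentially over time, and storage
memory is limited; thus, it is impractical to aim for batch
estimation. Under the aforementioned assumptions, the following
sparsity-promoting, rank-regularized, and exponentially-weighted
least-squares (EWLS) estimator can track the unknown
matrices

$$\{\hat{\mathbf{U}}^t, \hat{\mathbf{V}}^t, \hat{\mathbf{O}}^t\} = \arg\min_{\{\mathbf{U},\mathbf{V},\mathbf{O}\} \in \mathbb{R}_+^{N \times C}} \sum_{\tau=1}^{t} \beta^{t-\tau} \|\mathbf{A}^\tau - (\mathbf{U} + \mathbf{O})\mathbf{V}^\top\|_F^2 + \lambda_t \|\mathbf{U}\mathbf{V}^\top\|_* + \mu_t \|\mathbf{O}\|_0 \tag{6}$$

where the nuclear norm $\|\mathbf{U}\mathbf{V}^\top\|_* := \sum_{n} \sigma_n(\mathbf{U}\mathbf{V}^\top)$ sums
the singular values of $\mathbf{U}\mathbf{V}^\top$, while $\|\mathbf{O}\|_0 := \sum_{i,c} \mathbb{1}_{\{o_{ic} \neq 0\}}$
counts the non-zero entries in $\mathbf{O}$. Regularization parameters
$\lambda_t > 0$ and $\mu_t > 0$ control the low rank of $\mathbf{U}\mathbf{V}^\top$ and sparsity
in $\mathbf{O}$, respectively. Finally, $\beta^{t-\tau}$ is a “forgetting” factor with
$\beta \in (0, 1]$, which facilitates tracking slow variations by down-weighing
past data when $\beta < 1$.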

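Problem (6) is non-convex and NP-hard to solve. Nevertheless,
this can be circumvented by resorting to tight convex
relaxation. Specifically, $\|\mathbf{O}\|_0$ can be surrogated with $\|\mathbf{O}\|_1 := \sum_{i,c} |o_{ic}|$,
and one can leverage the following characterization
of the nuclear norm; see e.g., [21]

$$\|\mathbf{Z}\|_* := \min_{\{\mathbf{U},\mathbf{V}\}} \frac{1}{2}\left\{\|\mathbf{U}\|_F^2 + \|\mathbf{V}\|_F^2\right\} \quad \text{s.t.} \quad \mathbf{Z} = \mathbf{U}\mathbf{V}^\top \tag{7}$$

which leads to the following optimization problem

$$\{\hat{\mathbf{U}}^t, \hat{\mathbf{V}}^t, \hat{\mathbf{O}}^t\} = \arg\min_{\{\mathbf{U},\mathbf{V},\mathbf{O}\} \in \mathbb{R}_+^{N \times C}} \sum_{\tau=1}^{t} \beta^{t-\tau} \|\mathbf{A}^\tau - (\mathbf{U} + \mathbf{O})\mathbf{V}^\top\|_F^2 + \frac{\lambda_t}{2}\left\{\|\mathbf{U}\|_F^2 + \|\mathbf{V}\|_F^2\right\} + \mu_t \|\mathbf{O}\|_1 \tag{8}$$

whose separability renders it amenable to alternating minimization
(AM) strategies, as discussed next.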

B. Alternating minimization 

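Focusing on the first term of the cost in (8), note that

$$\sum_{\tau=1}^{t} \beta^{t-\tau} \|\mathbf{A}^\tau - (\mathbf{U} + \mathbf{O})\mathbf{V}^\top\|_F^2 = s^t\, \mathrm{Tr}\left\{\mathbf{V}(\mathbf{U}^\top\mathbf{U} + \mathbf{O}^\top\mathbf{O})\mathbf{V}^\top + 2\mathbf{V}\mathbf{U}^\top\mathbf{O}\mathbf{V}^\top\right\} - 2\,\mathrm{Tr}\left\{(\mathbf{S}^t)^\top(\mathbf{U} + \mathbf{O})\mathbf{V}^\top\right\} \tag{9}$$

where the equality holds up to an additive term that does not depend on
$\{\mathbf{U}, \mathbf{V}, \mathbf{O}\}$, and $s^t := (1 - \beta^t)/(1 - \beta)$ and $\mathbf{S}^t := \mathbf{A}^t + \beta\mathbf{S}^{t-1}$
recursively accumulate past data with minimal storage requirements.
Since (8) is separable across the optimization variables, one can
resort to iterative AM, by alternately solving for each variable
while holding the others fixed. With

$$\Phi(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{S}^t, s^t) := \sum_{\tau=1}^{t} \beta^{t-\tau} \|\mathbf{A}^\tau - (\mathbf{U} + \mathbf{O})\mathbf{V}^\top\|_F^2 \tag{10}$$

AM iterations amount to the following per-interval updates

$$\mathbf{U}[k] = \arg\min_{\mathbf{U} \in \mathbb{R}_+^{N \times C}} \Phi(\mathbf{U}, \mathbf{V}[k-1], \mathbf{O}[k-1], \mathbf{S}^t, s^t) + (\lambda_t/2)\|\mathbf{U}\|_F^2 \tag{11a}$$

$$\mathbf{V}[k] = \arg\min_{\mathbf{V} \in \mathbb{R}_+^{N \times C}} \Phi(\mathbf{U}[k], \mathbf{V}, \mathbf{O}[k-1], \mathbf{S}^t, s^t) + (\lambda_t/2)\|\mathbf{V}\|_F^2 \tag{11b}$$

$$\mathbf{O}[k] = \arg\min_{\mathbf{O} \in \mathbb{R}_+^{N \times C}} \Phi(\mathbf{U}[k], \mathbf{V}[k], \mathbf{O}, \mathbf{S}^t, s^t) + \mu_t\|\mathbf{O}\|_1 \tag{11c}$$

over iterations indexed by $k$, until convergence is achieved.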
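The storage saving behind (9) can be checked numerically. The sketch below is only a sanity check under assumed random data: it compares the exponentially-weighted fit computed from all stored snapshots against the expression in (9), which touches the data only through the running quantities. The variable `const` tracks the additive data-only term that does not affect the minimizers.

```python
import numpy as np

def ewls_fit_direct(A_list, U, V, O, beta):
    """Data-fit term of (8), computed by keeping every past snapshot."""
    t = len(A_list)
    R = (U + O) @ V.T
    return sum(beta ** (t - tau) * np.linalg.norm(A - R, "fro") ** 2
               for tau, A in enumerate(A_list, start=1))

def ewls_fit_recursive(S, s, U, V, O):
    """Right-hand side of (9); depends on the data only through S and s."""
    W = U + O
    return s * np.trace(V @ W.T @ W @ V.T) - 2.0 * np.trace(S.T @ W @ V.T)

rng = np.random.default_rng(0)
N, C, T, beta = 8, 2, 5, 0.9           # hypothetical sizes and forgetting factor
A_list = [rng.random((N, N)) for _ in range(T)]
U, V, O = (rng.random((N, C)) for _ in range(3))

S, s, const = np.zeros((N, N)), 0.0, 0.0
for A in A_list:
    S = A + beta * S                   # S^t = A^t + beta * S^{t-1}
    s = 1.0 + beta * s                 # equals (1 - beta**t) / (1 - beta)
    const = np.linalg.norm(A, "fro") ** 2 + beta * const

direct = ewls_fit_direct(A_list, U, V, O, beta)
recursive = ewls_fit_recursive(S, s, U, V, O) + const
```

Here `direct` and `recursive` agree to numerical precision, so a tracker only needs to keep one $N \times N$ matrix and one scalar rather than the whole history.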

The constrained subproblems in (11a) and (11b) are convex,
and can be readily solved via projected gradient (PG) iterations.
Since the gradients of their cost functions are available, and
the projection operator onto the nonnegative orthant is well
defined, PG iterations are guaranteed to eventually converge to
the optimal solution [5, p. 223]. Per iteration $k$, PG updates
amount to setting

$$\mathbf{U}[k] = \left[\mathbf{U}[k-1] - \alpha_{u,k} \nabla_{\mathbf{U}} \Phi(\mathbf{U}[k-1], \mathbf{V}[k-1], \mathbf{O}[k-1], \mathbf{S}^t, s^t)\right]_+ \tag{12}$$

$$\mathbf{V}[k] = \left[\mathbf{V}[k-1] - \alpha_{v,k} \nabla_{\mathbf{V}} \Phi(\mathbf{U}[k], \mathbf{V}[k-1], \mathbf{O}[k-1], \mathbf{S}^t, s^t)\right]_+ \tag{13}$$

where $\alpha_{u,k}$ and $\alpha_{v,k}$ denote (possibly) iteration-dependent step
sizes. In addition, $\nabla_{\mathbf{U}} \Phi(\cdot)$ ($\nabla_{\mathbf{V}} \Phi(\cdot)$) denotes the gradient of
$\Phi(\cdot)$ with respect to $\mathbf{U}$ ($\mathbf{V}$). Expressions for the gradients
can be readily obtained, but are omitted here due to space
constraints.
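Since the gradient expressions are omitted, the sketch below gives one way to obtain them by differentiating the right-hand side of (9); these formulas are derived here for illustration and are not quoted from the paper. The helper names (`phi`, `grad_U`, `grad_V`, `pg_step`) and the toy data are hypothetical.

```python
import numpy as np

def phi(U, V, O, S, s):
    """Phi from (9), dropping the additive term that is constant in (U, V, O)."""
    W = U + O
    return s * np.linalg.norm(W @ V.T, "fro") ** 2 - 2.0 * np.sum(S * (W @ V.T))

def grad_U(U, V, O, S, s):
    """Gradient of phi with respect to U (identical in form for O)."""
    return 2.0 * s * (U + O) @ (V.T @ V) - 2.0 * S @ V

def grad_V(U, V, O, S, s):
    """Gradient of phi with respect to V."""
    W = U + O
    return 2.0 * s * V @ (W.T @ W) - 2.0 * S.T @ W

def pg_step(X, G, alpha):
    """Projected gradient step as in (12)-(13): descend, then clip at zero."""
    return np.maximum(X - alpha * G, 0.0)

rng = np.random.default_rng(0)
N, C = 6, 2
U, V, O = (rng.random((N, C)) for _ in range(3))
S, s = rng.random((N, N)), 1.7

# One projected gradient step on U with a step size of 1 / Lipschitz constant.
alpha_u = 1.0 / (2.0 * s * np.linalg.eigvalsh(V.T @ V)[-1])
U_next = pg_step(U, grad_U(U, V, O, S, s), alpha_u)
```

The step-size choice `alpha_u` (inverse of the largest eigenvalue of $2 s^t \mathbf{V}^\top\mathbf{V}$) is one common option that guarantees descent on the smooth term; the paper leaves the step sizes unspecified.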

The cost function in (11c) is convex with both smooth and
non-smooth terms. Recent advances in proximal algorithms
have led to efficient, provably-convergent iterative schemes
for solving such optimization problems. We will resort to the
fast iterative shrinkage thresholding algorithm (FISTA) whose
accelerated convergence rate renders it attractive for sequential
learning [4].

Algorithm 1 Alternating minimization

1: Input: $\{\mathbf{A}^t\}_{t=1}^{T}$, $\beta$, $C$
2: Initialize $\mathbf{U}[0]$, $\mathbf{V}[0]$, $\mathbf{O}[0]$
3: Set $\mathbf{S}^0 = \mathbf{0}$
4: for $t = 1, \ldots, T$ do
5:   Set $s^t = (1 - \beta^t)/(1 - \beta)$
6:   Update $\mathbf{S}^t = \mathbf{A}^t + \beta\mathbf{S}^{t-1}$
7:   Initialize $k = 0$
8:   repeat
9:     $k = k + 1$
10:    Set $\alpha_{u,k}$, $\alpha_{v,k}$
11:    Compute $\mathbf{U}[k]$ via (12)
12:    Compute $\mathbf{V}[k]$ via (13)
13:    $r = 0$, $\mathbf{W}_r[k] = \mathbf{O}_r[k] = \mathbf{O}[k-1]$, $\theta_r[k] = 1$
14:    repeat
15:      Update $\mathbf{X}_r[k]$ via (15)
16:      $\mathbf{O}_r[k] = [\mathcal{S}_{\mu_t/L_\Phi}(\mathbf{X}_r[k])]_+$
17:      $\theta_{r+1}[k] = (1 + \sqrt{1 + 4\theta_r^2[k]})/2$
18:      Update $\mathbf{W}_{r+1}[k]$ via (16)
19:      $r = r + 1$
20:    until $\mathbf{O}_r[k]$ converges
21:    $\mathbf{O}[k] = \mathbf{O}_r[k]$
22:  until $\{\mathbf{U}[k], \mathbf{V}[k], \mathbf{O}[k]\}$ converge
23:  $\hat{\mathbf{U}}^t = \mathbf{U}[k]$, $\hat{\mathbf{V}}^t = \mathbf{V}[k]$, $\hat{\mathbf{O}}^t = \mathbf{O}[k]$
24: end for

C. FISTA for outlier updates 

Note that (11c) does not admit a closed-form solution, and
the proposed strategy will entail a number of inner iterations
(indexed by $r$) per AM iteration $k$. FISTA solves for $\mathbf{O}$ in
(11c) through a two-step update involving gradient descent on
$\Phi(\cdot)$, evaluated at a linear combination of the two most recent
iterates, followed by a closed-form soft-thresholding step per
iteration $r$. Setting $\theta_0[k] = 1$ and $\mathbf{W}_0[k] = \mathbf{O}[k-1]$, it turns out
that the updates can be written as [4]

$$\mathbf{O}_r[k] = \arg\min_{\mathbf{O} \in \mathbb{R}_+^{N \times C}} \frac{L_\Phi}{2}\|\mathbf{O} - \mathbf{X}_r[k]\|_F^2 + \mu_t\|\mathbf{O}\|_1 \tag{14}$$

where

$$\mathbf{X}_r[k] = \mathbf{W}_r[k] - (1/L_\Phi)\nabla_{\mathbf{O}} \Phi(\mathbf{W}_r[k], \mathbf{U}[k], \mathbf{V}[k], \mathbf{S}^t, s^t) \tag{15}$$

with

$$\theta_{r+1}[k] = \left(1 + \sqrt{1 + 4\theta_r^2[k]}\right)/2.$$

Furthermore,

$$\mathbf{W}_{r+1}[k] = \mathbf{O}_r[k] + \frac{\theta_r[k] - 1}{\theta_{r+1}[k]}\left(\mathbf{O}_r[k] - \mathbf{O}_{r-1}[k]\right) \tag{16}$$

where $L_\Phi$ denotes a Lipschitz constant of $\nabla_{\mathbf{O}} \Phi(\cdot)$. Note
that (14) is similar to the so-termed least-absolute shrinkage
and selection operator (Lasso) with a closed-form solution,
namely $[\mathbf{O}_r[k]]_{ij} = [\mathcal{S}_{\mu_t/L_\Phi}([\mathbf{X}_r[k]]_{ij})]_+$, where the thresholding
operator is defined entry-wise as $\mathcal{S}_\mu(x) := (|x| - \mu)_+\,\mathrm{sign}(x)$
[14, Ch. 3]. Computation of $\mathbf{O}[k]$ entails solving
(14) over several iterations indexed by $r$ until convergence.
Algorithm 1 summarizes the details of the developed community
tracking scheme, with $\beta$ and $C$ assumed to be given as
algorithm inputs.
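The FISTA inner loop for the outlier update can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the gradient and the Lipschitz constant $L_\Phi = 2 s^t \lambda_{\max}(\mathbf{V}^\top\mathbf{V})$ are derived here from (9), a fixed iteration count replaces a convergence test, and the synthetic data, `mu`, and `n_iter` are assumptions. It also assumes $\mathbf{V} \neq \mathbf{0}$ so that the Lipschitz constant is positive.

```python
import numpy as np

def soft_threshold(X, thr):
    """Entry-wise S_thr(x) = (|x| - thr)_+ * sign(x)."""
    return np.sign(X) * np.maximum(np.abs(X) - thr, 0.0)

def fista_outliers(U, V, O_init, S, s, mu, n_iter=200):
    """Approximately solve (11c) for O with U and V held fixed."""
    G = V.T @ V
    L = 2.0 * s * np.linalg.eigvalsh(G)[-1]        # Lipschitz constant (derived)
    O_prev = O_init.copy()
    W, theta = O_init.copy(), 1.0
    for _ in range(n_iter):
        grad = 2.0 * s * (U + W) @ G - 2.0 * S @ V  # gradient w.r.t. O at W
        X = W - grad / L                            # gradient step, cf. (15)
        O = np.maximum(soft_threshold(X, mu / L), 0.0)  # threshold, then project
        theta_next = (1.0 + np.sqrt(1.0 + 4.0 * theta ** 2)) / 2.0
        W = O + ((theta - 1.0) / theta_next) * (O - O_prev)  # cf. (16)
        O_prev, theta = O, theta_next
    return O_prev

# Toy check: one anomalous affiliation hidden in an otherwise clean network.
rng = np.random.default_rng(0)
N, C = 20, 3
U, V = rng.random((N, C)), rng.random((N, C))
O_true = np.zeros((N, C))
O_true[4, 1] = 1.5
A = (U + O_true) @ V.T
O_hat = fista_outliers(U, V, np.zeros((N, C)), S=A, s=1.0, mu=0.01, n_iter=500)
```

With the true `U` and `V` supplied, the estimate `O_hat` concentrates on the single planted entry, which is the behavior the sparsity-promoting penalty is meant to produce.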

IV. Delay-sensitive operation 

Algorithm 1 relies upon convergence of the unknown variables
per time interval. Unfortunately, this mode of operation
is not suitable for delay-sensitive applications, where decisions
must be made “on the fly.” In fact, one may be willing to trade
off solution accuracy for real-time operation in certain application
domains. This section puts forth a couple of algorithmic
enhancements that will facilitate real-time tracking, namely
premature termination of PG iterations, and leveraging the
stochastic gradient descent framework.

A. Premature termination 

For networks that generally evolve slowly over time, it is
not necessary to run the tracking algorithm until convergence
per time interval. Since a compromise can be struck between
an accurate solution computed slowly and a less accurate
solution computed very fast, one can judiciously truncate the
number of inner iterations to $k_{\max}$. This premature termination
is well motivated when the network topology is piecewise
stationary with sufficiently long coherence time, with respect
to the number of time intervals. It is then unnecessary to seek
convergence per time interval since it can be argued that a
“good” solution will be attained across time intervals before the
topology changes. If the network topology varies in accordance
with a stationary distribution, it can be demonstrated that
convergence will eventually be attained, even when $k_{\max} = 1$,
i.e., running a single iteration per time interval. Algorithm 2
summarizes the steps involved in this inexact tracking scheme
under the special case with $k_{\max} = 1$.


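Algorithm 2 Inexact alternating minimization

1: Input: $\{\mathbf{A}^t\}_{t=1}^{T}$, $\beta$, $C$
2: Initialize $\mathbf{U}^0$, $\mathbf{V}^0$, $\mathbf{O}^0$
3: Set $\mathbf{S}^0 = \mathbf{0}$, $\mathbf{W}^0 = \mathbf{O}^0$, $\theta_0 = 1$
4: for $t = 1, \ldots, T$ do
5:   Set $s^t = (1 - \beta^t)/(1 - \beta)$
6:   Update $\mathbf{S}^t = \mathbf{A}^t + \beta\mathbf{S}^{t-1}$
7:   Set step sizes $\alpha_{u,t}$, $\alpha_{v,t}$
8:   $\mathbf{U}^t = [\mathbf{U}^{t-1} - \alpha_{u,t}\nabla_{\mathbf{U}}\Phi(\mathbf{U}^{t-1}, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}, \mathbf{S}^t, s^t)]_+$
9:   $\mathbf{V}^t = [\mathbf{V}^{t-1} - \alpha_{v,t}\nabla_{\mathbf{V}}\Phi(\mathbf{U}^t, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}, \mathbf{S}^t, s^t)]_+$
10:  $\mathbf{X}^t = \mathbf{W}^{t-1} - (1/L_\Phi)\nabla_{\mathbf{O}}\Phi(\mathbf{W}^{t-1}, \mathbf{U}^t, \mathbf{V}^t, \mathbf{S}^t, s^t)$
11:  $\mathbf{O}^t = [\mathcal{S}_{\mu_t/L_\Phi}(\mathbf{X}^t)]_+$
12:  $\theta_t = (1 + \sqrt{1 + 4\theta_{t-1}^2})/2$
13:  $\mathbf{W}^t = \mathbf{O}^t + ((\theta_{t-1} - 1)/\theta_t)(\mathbf{O}^t - \mathbf{O}^{t-1})$
14: end for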
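The single-iteration scheme of Algorithm 2 can be sketched in a few lines. Several details below are assumptions made for the sake of a runnable example and are not specified in the paper: the random initialization, the step sizes (inverse Lipschitz constants of the smooth costs), the inclusion of the ridge terms of (11a) and (11b) in the gradients, and the default values of `beta`, `lam`, and `mu`.

```python
import numpy as np

def inexact_am_track(A_seq, C, beta=0.9, lam=0.1, mu=0.1, seed=0):
    """One projected gradient step for U and V, one FISTA step for O, per snapshot."""
    rng = np.random.default_rng(seed)
    N = A_seq[0].shape[0]
    U, V = rng.random((N, C)), rng.random((N, C))
    O = np.zeros((N, C))
    W, theta = O.copy(), 1.0
    S, s = np.zeros((N, N)), 0.0
    estimates = []
    for A in A_seq:
        S = A + beta * S                    # S^t = A^t + beta * S^{t-1}
        s = 1.0 + beta * s                  # s^t = (1 - beta^t) / (1 - beta)

        # U update: gradient of Phi + (lam/2)||U||_F^2, step = 1 / Lipschitz.
        G = V.T @ V
        L_u = 2.0 * s * np.linalg.eigvalsh(G)[-1] + lam + 1e-12
        grad_u = 2.0 * s * (U + O) @ G - 2.0 * S @ V + lam * U
        U = np.maximum(U - grad_u / L_u, 0.0)

        # V update: same idea, using the freshly updated U.
        Wm = U + O
        H = Wm.T @ Wm
        L_v = 2.0 * s * np.linalg.eigvalsh(H)[-1] + lam + 1e-12
        grad_v = 2.0 * s * V @ H - 2.0 * S.T @ Wm + lam * V
        V = np.maximum(V - grad_v / L_v, 0.0)

        # O update: a single FISTA step evaluated at the extrapolated point W.
        G = V.T @ V
        L_o = 2.0 * s * np.linalg.eigvalsh(G)[-1] + 1e-12
        X = W - (2.0 * s * (U + W) @ G - 2.0 * S @ V) / L_o
        O_new = np.maximum(X - mu / L_o, 0.0)   # equals [S_{mu/L}(X)]_+
        theta_new = (1.0 + np.sqrt(1.0 + 4.0 * theta ** 2)) / 2.0
        W = O_new + ((theta - 1.0) / theta_new) * (O_new - O)
        O, theta = O_new, theta_new
        estimates.append((U, V, O))
    return estimates

def rel_err(A, U, V, O):
    """Relative Frobenius-norm fit error of (U + O) V^T to A."""
    return np.linalg.norm(A - (U + O) @ V.T) / np.linalg.norm(A)

# Toy usage: a static rank-3 network observed over 200 intervals.
rng = np.random.default_rng(1)
A_true = rng.random((20, 3)) @ rng.random((20, 3)).T
est = inexact_am_track([A_true] * 200, C=3)
err_first = rel_err(A_true, *est[0])
err_last = rel_err(A_true, *est[-1])
```

On a static network such as this one, the fit is expected to improve across intervals even though no interval is solved to convergence, which is the intuition behind premature termination.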


B. Stochastic gradient descent 

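Instead of premature termination, real-time operation can be
facilitated by resorting to stochastic gradient descent (SGD)
iterations. Let

$$\mathcal{C}_\tau(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^\tau) := \|\mathbf{A}^\tau - (\mathbf{U} + \mathbf{O})\mathbf{V}^\top\|_F^2 \tag{17}$$

and

$$\mathcal{C}_{\lambda_\tau, \mu_\tau}(\mathbf{U}, \mathbf{V}, \mathbf{O}) := \frac{\lambda_\tau}{2}\left\{\|\mathbf{U}\|_F^2 + \|\mathbf{V}\|_F^2\right\} + \mu_\tau\|\mathbf{O}\|_1 \tag{18}$$

and consider the online learning setup, where the goal
is to minimize the expected cost $E\{\mathcal{C}_\tau(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^\tau) + \mathcal{C}_{\lambda_\tau, \mu_\tau}(\mathbf{U}, \mathbf{V}, \mathbf{O})\}$
(with respect to the unknown probability
distribution of the data). The present paper pursues
an online learning strategy, in which the expected
cost is replaced with the empirical cost function
$(1/t)\sum_{\tau=1}^{t}[\mathcal{C}_\tau(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^\tau) + \mathcal{C}_{\lambda_\tau, \mu_\tau}(\mathbf{U}, \mathbf{V}, \mathbf{O})]$ as a surrogate.
Generally, SGD is applicable to separable sum-minimization
settings, in which the $\tau$-th summand is a function
of the $\tau$-th datum. To incur the least computational and memory
storage costs, the SGD approach advocated here discards all
past data, and solves

$$\arg\min_{\{\mathbf{U}, \mathbf{V}, \mathbf{O}\}} \mathcal{C}_t(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^t) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{U}, \mathbf{V}, \mathbf{O}) \tag{19}$$

per interval $t$, which is tantamount to solving the EWLSE
upon setting $\beta = 0$. This tracking scheme is reminiscent
of the popular least mean-squares (LMS) algorithm, which has
been well-studied within the context of adaptive estimation;
see e.g., [26]. With $\mathbf{Y} := [\mathbf{U}\ \mathbf{V}\ \mathbf{O}]$, a common approach to
solving (19) involves the projected iteration (see e.g., [6])

$$\mathbf{Y}^t = \left[\mathbf{Y}^{t-1} - \eta\,\partial\left(\mathcal{C}_t(\mathbf{Y}) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{Y})\right)\big|_{\mathbf{Y} = \mathbf{Y}^{t-1}}\right]_+ \tag{20}$$

where $\partial(\cdot)$ denotes the subgradient operator, and $\eta > 0$
is a small constant step-size. A major limitation associated
with (20) is that the resulting per-iteration solutions $\mathbf{O}^t$ are
generally not sparse. Instead of subgradients, an AM procedure
with FISTA updates adopted for $\mathbf{O}^t$ is followed in a manner
similar to Algorithms 1 and 2. The main differences stem from
eliminating the recursive updates, and adopting constant step
sizes to facilitate tracking.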
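The statement that (19) coincides with the EWLS estimator at $\beta = 0$ can be seen directly from the recursions for $\mathbf{S}^t$ and $s^t$. The short sketch below, with a hypothetical helper `accumulate` and random snapshots, contrasts the memoryless case with the infinite-memory case $\beta = 1$.

```python
import numpy as np

def accumulate(A_seq, beta):
    """Run S^t = A^t + beta * S^{t-1} and s^t = 1 + beta * s^{t-1} over a sequence."""
    S = np.zeros_like(A_seq[0], dtype=float)
    s = 0.0
    for A in A_seq:
        S = A + beta * S
        s = 1.0 + beta * s
    return S, s

rng = np.random.default_rng(0)
A_seq = [rng.random((5, 5)) for _ in range(4)]

S0, s0 = accumulate(A_seq, beta=0.0)   # memoryless: only the latest snapshot survives
S1, s1 = accumulate(A_seq, beta=1.0)   # infinite memory: plain running sum
```

With $\beta = 0$ the accumulated statistics reduce to $(\mathbf{A}^t, 1)$, which is why the recursive updates can be dropped in the SGD variant.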

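To this end, the following subproblems are solved during
interval $t$:

$$\mathbf{U}^t = \arg\min_{\mathbf{U} \in \mathbb{R}_+^{N \times C}} \mathcal{C}_t(\mathbf{U}, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}, \mathbf{A}^t) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{U}, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}) \tag{21}$$

$$\mathbf{V}^t = \arg\min_{\mathbf{V} \in \mathbb{R}_+^{N \times C}} \mathcal{C}_t(\mathbf{U}^t, \mathbf{V}, \mathbf{O}^{t-1}, \mathbf{A}^t) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{U}^t, \mathbf{V}, \mathbf{O}^{t-1}) \tag{22}$$

$$\mathbf{O}^t = \arg\min_{\mathbf{O} \in \mathbb{R}_+^{N \times C}} \mathcal{C}_t(\mathbf{U}^t, \mathbf{V}^t, \mathbf{O}, \mathbf{A}^t) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{U}^t, \mathbf{V}^t, \mathbf{O}). \tag{23}$$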








IEEE TRANSACTIONS OF SIGNAL PROCESSING (SUBMITTED) 




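In order to operate in real-time, $\mathbf{U}$ and $\mathbf{V}$ can be updated by a
single gradient descent step per time slot. Similarly, minimization
of (23) across time slots entails a single FISTA update that
linearly combines $\mathbf{O}^{t-1}$ and $\mathbf{O}^{t-2}$. Exact algorithmic details
of the SGD-based community tracking algorithm are tabulated
in Algorithm 3, with $\mathcal{T}_t(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^t) := \mathcal{C}_t(\mathbf{U}, \mathbf{V}, \mathbf{O}, \mathbf{A}^t) + \mathcal{C}_{\lambda_t, \mu_t}(\mathbf{U}, \mathbf{V}, \mathbf{O})$.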


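Algorithm 3 SGD tracking algorithm

1: Input: $\{\mathbf{A}^t\}_{t=1}^{T}$, $\beta$, $C$, $\alpha_u$, $\alpha_v$
2: Initialize $\mathbf{U}^0$, $\mathbf{V}^0$, $\mathbf{O}^0$
3: Set $\mathbf{W}^0 = \mathbf{O}^0$, $\theta_0 = 1$
4: for $t = 1, \ldots, T$ do
5:   $\mathbf{U}^t = [\mathbf{U}^{t-1} - \alpha_u\nabla_{\mathbf{U}}\mathcal{T}_t(\mathbf{U}^{t-1}, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}, \mathbf{A}^t)]_+$
6:   $\mathbf{V}^t = [\mathbf{V}^{t-1} - \alpha_v\nabla_{\mathbf{V}}\mathcal{T}_t(\mathbf{U}^t, \mathbf{V}^{t-1}, \mathbf{O}^{t-1}, \mathbf{A}^t)]_+$
7:   $\mathbf{X}^t = \mathbf{W}^{t-1} - (1/L_\Phi)\nabla_{\mathbf{O}}\Phi(\mathbf{W}^{t-1}, \mathbf{U}^t, \mathbf{V}^t, \mathbf{A}^t, 1)$
8:   $\mathbf{O}^t = [\mathcal{S}_{\mu_t/L_\Phi}(\mathbf{X}^t)]_+$
9:   $\theta_t = (1 + \sqrt{1 + 4\theta_{t-1}^2})/2$
10:  $\mathbf{W}^t = \mathbf{O}^t + ((\theta_{t-1} - 1)/\theta_t)(\mathbf{O}^t - \mathbf{O}^{t-1})$
11: end for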


V. Decentralized Implementation 

The tracking algorithms developed so far have assumed that
connectivity data (i.e., $\{\mathbf{A}^t\}$) are acquired and processed in a
centralized fashion. This may turn out to be infeasible, since for
example, certain applications store large graphs over distributed
file storage systems hosted across a large network of computers.
In fact, graphs capturing the web link structure, and online
social networks are typically stored as “chunks” of files that
are both distributed across computing nodes, and spread spatially over
several geographical sites. In addition to the inherent computational
bottlenecks, soaring data communication costs would
render centralized approaches infeasible in such scenarios.

In light of these computational constraints, this section puts
forth a decentralized algorithm that jointly tracks temporal
communities and anomalous nodes. The alternating-direction
method of multipliers (ADMM) has recently emerged as a powerful
tool for decentralized optimization problems [25], and
it will be adopted here for the community tracking task. A
connected network of computing agents is deployed, with links
representing direct communication paths between nodes. The
key idea is that each node iteratively solves the problem using
only a subset of the input data, while exchanging intermediate
solutions with single-hop neighbors until consensus is
achieved.


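A. Consensus constraints

Consider an undirected graph $\mathcal{G} = (\mathcal{M}, \mathcal{L})$ whose vertices
$\mathcal{M} := \{1, \ldots, M\}$ are $M$ spatially-distributed computing
agents, and whose edges $\mathcal{L} := \{1, \ldots, L\}$ are representative
of direct communication links between pairs of agents. It is
assumed that the processing network abstracted by $\mathcal{G}$ is connected,
so that (multi-hop) communication is possible between
any pair of agents. Suppose the temporal adjacency matrix
is partitioned as follows $\mathbf{A}^t := [(\mathbf{A}_1^t)^\top, \ldots, (\mathbf{A}_M^t)^\top]^\top$, and
it is distributed across the processing network. During time
interval $t$, agent $m$ receives the submatrix $\mathbf{A}_m^t$. To minimize
communication costs, each agent is only allowed to send
and receive data from its single-hop neighborhood $\mathcal{N}_m$. Let
$\mathbf{U} := [\mathbf{U}_1^\top, \ldots, \mathbf{U}_M^\top]^\top$ and $\mathbf{O} := [\mathbf{O}_1^\top, \ldots, \mathbf{O}_M^\top]^\top$, where
$\mathbf{U}_m \in \mathbb{R}^{N_m \times C}$, $\mathbf{O}_m \in \mathbb{R}^{N_m \times C}$, and $\sum_{m=1}^{M} N_m = N$. In
terms of the per-agent submatrices, (8) can now be written as
follows

$$\arg\min_{\{\mathbf{U}_m, \mathbf{O}_m\}_{m=1}^{M},\ \mathbf{V} \in \mathbb{R}_+^{N \times C}} \sum_{m=1}^{M} \left[\sum_{\tau=1}^{t} \beta^{t-\tau}\|\mathbf{A}_m^\tau - (\mathbf{U}_m + \mathbf{O}_m)\mathbf{V}^\top\|_F^2 + \frac{\lambda_t}{2}\|\mathbf{U}_m\|_F^2 + \frac{\lambda_t}{2M}\|\mathbf{V}\|_F^2 + \mu_t\|\mathbf{O}_m\|_1\right] \tag{24}$$

for $t = 1, \ldots, T$. Clearly $\mathbf{U}$ and $\mathbf{O}$ decouple across computing
agents, whereas $\mathbf{V}$ does not. A viable approach entails allowing
each agent to solve for its corresponding unknowns $\mathbf{U}_m$ and
$\mathbf{O}_m$ in parallel, followed by communication of the estimates
to a central processing node that solves for $\mathbf{V}$. The downside
to this approach is that it involves a significant communication
and storage overhead as the central node must receive
intermediate values of $\{\mathbf{U}_m, \mathbf{O}_m\}_{m=1}^{M}$, then broadcast its
computed value of $\mathbf{V}$ per iteration. In addition, this introduces
the risk of a single point of failure at the central node.

To operate in a truly decentralized manner, each agent will
solve for $\mathbf{V}$ independently under consensus constraints requiring
equality of the solution to those computed by single-hop
neighbors [25]. Since $\mathcal{G}$ is connected, it can be readily shown
that consensus on the solution for $\mathbf{V}$ will be reached upon
convergence of the algorithm [25]. Incorporating consensus
constraints, each time interval entails solving the following
fully decoupled minimization problem

$$\arg\min_{\{\mathbf{U}_m, \mathbf{O}_m\} \in \mathbb{R}_+^{N_m \times C},\ \mathbf{V}_m \in \mathbb{R}_+^{N \times C}} \sum_{m=1}^{M} \left[\sum_{\tau=1}^{t} \beta^{t-\tau}\|\mathbf{A}_m^\tau - (\mathbf{U}_m + \mathbf{O}_m)\mathbf{V}_m^\top\|_F^2 + \frac{\lambda_t}{2}\|\mathbf{U}_m\|_F^2 + \frac{\lambda_t}{2M}\|\mathbf{V}_m\|_F^2 + \mu_t\|\mathbf{O}_m\|_1\right]$$
$$\text{s. to} \quad \mathbf{V}_m = \mathbf{V}_n, \quad n \in \mathcal{N}_m. \tag{25}$$

Letting

$$\Phi_{\lambda_t}(\mathbf{U}_m, \mathbf{O}_m, \mathbf{V}_m, \mathbf{S}_m^t, s^t) := \Phi(\mathbf{U}_m, \mathbf{V}_m, \mathbf{O}_m, \mathbf{S}_m^t, s^t) + \frac{\lambda_t}{2}\|\mathbf{U}_m\|_F^2 + \frac{\lambda_t}{2M}\|\mathbf{V}_m\|_F^2 \tag{26}$$

and introducing dummy variables $\{\mathbf{P}_m\}_{m=1}^{M}$, (25) can be
written as follows

$$\arg\min \sum_{m=1}^{M} \left[\Phi_{\lambda_t}(\mathbf{U}_m, \mathbf{O}_m, \mathbf{V}_m, \mathbf{S}_m^t, s^t) + \mu_t\|\mathbf{P}_m\|_1\right]$$
$$\text{s. to} \quad \mathbf{O}_m = \mathbf{P}_m, \quad \mathbf{V}_m = \mathbf{V}_n, \quad n \in \mathcal{N}_m. \tag{27}$$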
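The row-wise split behind (24) can be verified numerically. In the sketch below, a single snapshot stands in for the exponentially-weighted sum, and the number of agents `M`, the even row partition, and the parameter values are hypothetical. It checks that summing per-agent costs, with the penalty on $\mathbf{V}$ scaled by $1/M$, reproduces the centralized cost in (8).

```python
import numpy as np

def cost(A, U, O, V, lam_u, lam_v, mu):
    """Fit plus regularizers for one block of rows (single snapshot)."""
    fit = np.linalg.norm(A - (U + O) @ V.T, "fro") ** 2
    ridge = 0.5 * lam_u * np.linalg.norm(U, "fro") ** 2 \
        + 0.5 * lam_v * np.linalg.norm(V, "fro") ** 2
    return fit + ridge + mu * np.abs(O).sum()

rng = np.random.default_rng(0)
N, C, M = 12, 3, 3                      # hypothetical sizes; M computing agents
A = rng.random((N, N))
U, O, V = rng.random((N, C)), rng.random((N, C)), rng.random((N, C))
lam, mu = 0.3, 0.2

central = cost(A, U, O, V, lam, lam, mu)

# Agent m holds the rows A_m, U_m, O_m; V is shared, with penalty lam / M each.
blocks = np.array_split(np.arange(N), M)
distributed = sum(cost(A[idx], U[idx], O[idx], V, lam, lam / M, mu)
                  for idx in blocks)
```

The two totals match, which illustrates why only $\mathbf{V}$ couples the agents and needs consensus constraints.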
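In order to solve (27), introduce the variables $\{\bar{\mathbf{X}}_{nm}, \tilde{\mathbf{X}}_{nm}\}$
and modify the constraints $\mathbf{V}_m = \mathbf{V}_n$, $n \in \mathcal{N}_m$ as

$$\mathbf{V}_m = \bar{\mathbf{X}}_{nm}, \quad \mathbf{V}_n = \tilde{\mathbf{X}}_{nm}, \quad \bar{\mathbf{X}}_{nm} = \tilde{\mathbf{X}}_{nm}, \quad n \in \mathcal{N}_m.$$

Introducing the dual variables $\{\boldsymbol{\Gamma}_m\}_{m=1}^{M}$ and $\{\{\bar{\boldsymbol{\Pi}}_{nm}, \tilde{\boldsymbol{\Pi}}_{nm}\}_{n \in \mathcal{N}_m}\}_{m=1}^{M}$,
and temporarily ignoring non-negativity
constraints, the resulting augmented Lagrangian can
be written as

$$\begin{aligned}
\mathcal{L}_\rho(\mathcal{P}_1, \mathcal{P}_2, \mathcal{P}_3, \mathcal{D}) ={}& \sum_{m=1}^{M} \left[\Phi_{\lambda_t}(\mathbf{U}_m, \mathbf{O}_m, \mathbf{V}_m, \mathbf{S}_m^t, s^t) + \mu_t\|\mathbf{P}_m\|_1\right] \\
&+ \sum_{m=1}^{M} \mathrm{Tr}\left(\boldsymbol{\Gamma}_m^\top(\mathbf{O}_m - \mathbf{P}_m)\right) \\
&+ \sum_{m=1}^{M} \sum_{n \in \mathcal{N}_m} \mathrm{Tr}\left\{\bar{\boldsymbol{\Pi}}_{nm}^\top(\mathbf{V}_m - \bar{\mathbf{X}}_{nm}) + \tilde{\boldsymbol{\Pi}}_{nm}^\top(\mathbf{V}_n - \tilde{\mathbf{X}}_{nm})\right\} \\
&+ \frac{\rho}{2}\sum_{m=1}^{M} \|\mathbf{O}_m - \mathbf{P}_m\|_F^2 \\
&+ \frac{\rho}{2}\sum_{m=1}^{M} \sum_{n \in \mathcal{N}_m} \left\{\|\mathbf{V}_m - \bar{\mathbf{X}}_{nm}\|_F^2 + \|\mathbf{V}_n - \tilde{\mathbf{X}}_{nm}\|_F^2\right\}
\end{aligned} \tag{28}$$

where $\rho > 0$, $\mathcal{P}_1 := \{\mathbf{V}_m\}_{m=1}^{M}$, $\mathcal{P}_2 := \{\mathbf{U}_m, \mathbf{O}_m, \mathbf{P}_m\}_{m=1}^{M}$,
and $\mathcal{P}_3 := \{\{\bar{\mathbf{X}}_{nm}, \tilde{\mathbf{X}}_{nm}\}_{n \in \mathcal{N}_m}\}_{m=1}^{M}$ denote primal variables,
while $\mathcal{D} := \{\boldsymbol{\Gamma}_m, \{\bar{\boldsymbol{\Pi}}_{nm}, \tilde{\boldsymbol{\Pi}}_{nm}\}_{n \in \mathcal{N}_m}\}_{m=1}^{M}$ denotes the set of
dual variables.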

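Towards applying ADMM to (28), an iterative strategy
is pursued, entailing dual variable update as the first step,
followed by alternate minimization of $\mathcal{L}_\rho(\mathcal{P}_1, \mathcal{P}_2, \mathcal{P}_3, \mathcal{D})$ over
each of the primal variables, while holding the rest fixed to their
most recent values. Since $\mathcal{L}_\rho(\mathcal{P}_1, \mathcal{P}_2, \mathcal{P}_3, \mathcal{D})$ is completely
decoupled across the $M$ computing agents, the problem can
be solved in an entirely decentralized manner. The per-agent
updates of the proposed algorithm during iteration $k$ comprise
the following steps.

[S1]: Dual variable update.

$$\boldsymbol{\Gamma}_m[k+1] = \boldsymbol{\Gamma}_m[k] + \rho(\mathbf{O}_m[k] - \mathbf{P}_m[k]) \tag{29a}$$
$$\bar{\boldsymbol{\Pi}}_{nm}[k+1] = \bar{\boldsymbol{\Pi}}_{nm}[k] + \rho(\mathbf{V}_m[k] - \bar{\mathbf{X}}_{nm}[k]) \tag{29b}$$
$$\tilde{\boldsymbol{\Pi}}_{nm}[k+1] = \tilde{\boldsymbol{\Pi}}_{nm}[k] + \rho(\mathbf{V}_n[k] - \tilde{\mathbf{X}}_{nm}[k]) \tag{29c}$$

[S2]: Primal variable update.

$$\mathcal{P}_1[k+1] = \arg\min_{\mathcal{P}_1} \mathcal{L}_\rho(\mathcal{P}_1, \mathcal{P}_2[k], \mathcal{P}_3[k], \mathcal{D}[k+1]) \tag{30a}$$
$$\mathcal{P}_2[k+1] = \arg\min_{\mathcal{P}_2} \mathcal{L}_\rho(\mathcal{P}_1[k+1], \mathcal{P}_2, \mathcal{P}_3[k], \mathcal{D}[k+1]) \tag{30b}$$
$$\mathcal{P}_3[k+1] = \arg\min_{\mathcal{P}_3} \mathcal{L}_\rho(\mathcal{P}_1[k+1], \mathcal{P}_2[k+1], \mathcal{P}_3, \mathcal{D}[k+1]) \tag{30c}$$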

It can be shown that the splitting variables in $\mathcal{P}_3$ turn out to be redundant in the final algorithm. Letting $\boldsymbol{\Pi}_m[k] := \sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}[k]$, it turns out that $\boldsymbol{\Pi}_m[k]$ obeys the recursion in (31), and the dual variables $\boldsymbol{\Pi}_{nm}$ can be discarded. In addition, the per-iteration primal variable updates per agent simplify to (see Appendix A for derivations):

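\begin{align}
\mathbf{V}_m[k+1] = \arg\min_{\mathbf{V}_m} \; & \Phi_{\lambda_t}\!\left(\mathbf{U}_m[k], \mathbf{O}_m[k], \mathbf{V}_m, \mathbf{S}_m^t, s^t\right) \nonumber \\
& + \mathrm{Tr}\Big[ \tfrac{\rho}{2} |\mathcal{N}_m| \mathbf{V}_m^\top \mathbf{V}_m + \mathbf{V}_m^\top \Big( \boldsymbol{\Pi}_m[k+1] - \tfrac{\rho}{2} \Big\{ |\mathcal{N}_m| \mathbf{V}_m[k] + \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big\} \Big) \Big] \tag{32} \\
\mathbf{U}_m[k+1] = \arg\min_{\mathbf{U}_m} \; & \Phi_{\lambda_t}\!\left(\mathbf{U}_m, \mathbf{O}_m[k], \mathbf{V}_m[k+1], \mathbf{S}_m^t, s^t\right) \tag{33} \\
\mathbf{O}_m[k+1] = \arg\min_{\mathbf{O}_m} \; & \Phi_{\lambda_t}\!\left(\mathbf{U}_m[k+1], \mathbf{O}_m, \mathbf{V}_m[k+1], \mathbf{S}_m^t, s^t\right) \nonumber \\
& + \mathrm{Tr}\!\left( \boldsymbol{\Gamma}_m^\top[k+1] \mathbf{O}_m \right) + \tfrac{\rho}{2} \|\mathbf{O}_m - \mathbf{P}_m[k]\|_F^2 \tag{34} \\
\mathbf{P}_m[k+1] = \arg\min_{\mathbf{P}_m} \; & \tfrac{\rho}{2} \|\mathbf{O}_m[k+1] - \mathbf{P}_m\|_F^2 - \mathrm{Tr}\!\left( \boldsymbol{\Gamma}_m^\top[k+1] \mathbf{P}_m \right) + \mu_t \|\mathbf{P}_m\|_1. \tag{35}
\end{align}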
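The update (35) is the one step whose closed form can be written down without knowing $\Phi_{\lambda_t}$: completing the square turns it into an entrywise soft-thresholding of $\mathbf{O}_m[k+1] + \boldsymbol{\Gamma}_m[k+1]/\rho$ with threshold $\mu_t/\rho$. The following is a minimal NumPy sketch of that step only; the function names (`soft_threshold`, `update_P`, `objective_P`) are hypothetical and chosen for illustration, not taken from the paper.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Entrywise soft-thresholding: sign(z) * max(|z| - tau, 0)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def update_P(O_next, Gamma_next, mu_t, rho):
    """Minimizer of (rho/2)||O - P||_F^2 - Tr(Gamma^T P) + mu_t ||P||_1 over P.

    Completing the square gives (rho/2)||P - (O + Gamma/rho)||_F^2 + mu_t ||P||_1
    up to a constant, whose minimizer is soft-thresholding at level mu_t / rho.
    """
    return soft_threshold(O_next + Gamma_next / rho, mu_t / rho)

def objective_P(P, O_next, Gamma_next, mu_t, rho):
    """Cost in (35), used here only to sanity-check the closed form."""
    return (0.5 * rho * np.sum((O_next - P) ** 2)
            - np.sum(Gamma_next * P)
            + mu_t * np.sum(np.abs(P)))
```

Because the cost is strictly convex in $\mathbf{P}_m$, any perturbation of the returned matrix should not decrease `objective_P`, which gives a cheap numerical check of the update.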

Algorithm 4 summarizes the steps involved in the per-agent 
decentralized ADMM algorithm. 


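Algorithm 4 Decentralized tracking algorithm per agent $m$

Input: $\{\mathbf{A}_m^t\}_{t=1}^{T}$, $\beta$, $C$, $\rho$
Initialize $\mathbf{U}_m^0$, $\mathbf{V}_m^0$, $\mathbf{O}_m^0 = \mathbf{P}_m^0 = \mathbf{0}$
$\boldsymbol{\Pi}_m[0] = \mathbf{0}$, $\boldsymbol{\Gamma}_m[0] = \mathbf{0}$
for $t = 1 \ldots T$ do
    Set $s^t = (1 - \beta^t)/(1 - \beta)$
    Update $\mathbf{S}_m^t = \mathbf{A}_m^t + \beta \mathbf{S}_m^{t-1}$
    Update $\mu_t$ and $\lambda_t$
    $\mathbf{V}_m[0] = \mathbf{V}_m^{t-1}$, $\mathbf{U}_m[0] = \mathbf{U}_m^{t-1}$
    $\mathbf{O}_m[0] = \mathbf{P}_m[0] = \mathbf{O}_m^{t-1}$, $k = 0$
    repeat
        Receive $\{\mathbf{V}_n[k]\}_{n \in \mathcal{N}_m}$ from neighbors of $m$
        $\boldsymbol{\Gamma}_m[k+1] = \boldsymbol{\Gamma}_m[k] + \rho(\mathbf{O}_m[k] - \mathbf{P}_m[k])$
        Compute $\boldsymbol{\Pi}_m[k+1]$ according to (31)
        Update $\mathbf{V}_m[k+1]$ via (32)
        Update $\mathbf{U}_m[k+1]$ via (33)
        Update $\mathbf{O}_m[k+1]$ via (34)
        Update $\mathbf{P}_m[k+1]$ via (35)
        Broadcast $\mathbf{V}_m[k+1]$ to single-hop neighbors
        $k = k + 1$
    until $\mathbf{U}_m[k]$, $\mathbf{V}_m[k]$, $\mathbf{O}_m[k]$ converge
    $\mathbf{U}_m^t = \mathbf{U}_m[k]$, $\mathbf{V}_m^t = \mathbf{V}_m[k]$, $\mathbf{O}_m^t = \mathbf{O}_m[k]$
end for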
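The two bookkeeping steps at the top of the outer loop in Algorithm 4 can be illustrated in isolation. The sketch below assumes the recursion starts from $\mathbf{S}_m^0 = \mathbf{0}$ (not stated in this section), in which case $\mathbf{S}_m^t$ equals the exponentially weighted sum $\sum_{\tau=1}^{t} \beta^{t-\tau} \mathbf{A}_m^\tau$ and $s^t$ equals the matching sum of weights. The function name `aggregate` is hypothetical.

```python
import numpy as np

def aggregate(A_seq, beta):
    """Run S^t = A^t + beta * S^{t-1} and s^t = (1 - beta^t) / (1 - beta).

    Assumes S^0 = 0 and 0 < beta < 1. Returns a list of (S^t, s^t) pairs,
    one per time step t = 1, ..., len(A_seq).
    """
    S = np.zeros_like(A_seq[0], dtype=float)
    history = []
    for t, A in enumerate(A_seq, start=1):
        S = A + beta * S                        # recursive data aggregation
        s = (1.0 - beta ** t) / (1.0 - beta)    # closed form of 1 + beta + ... + beta^(t-1)
        history.append((S.copy(), s))
    return history
```

Only the running aggregate needs to be stored per agent, so memory does not grow with $t$ even though all past data enter the cost.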


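\[
\boldsymbol{\Pi}_m[k+1] = \boldsymbol{\Pi}_m[k] + \frac{\rho}{2} \Big( |\mathcal{N}_m| \mathbf{V}_m[k] - \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big) \tag{31}
\]


VI. Simulations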


A. Synthetic Data 

Data generation. An initial synthetic network with $N = 100$ nodes and $C = 5$ communities was generated using the stochastic blockmodel (SBM) [15]. The SBM parameters were set to $\delta_{ii} = 0.8$, $i = 1, \ldots, 5$, and $\delta_{ij} \in \{0, 0.1\}$, $i \neq j$, selected so that the SBM matrix is stochastic (see Figure 2(a)). The initial SBM network was captured through an adjacency matrix $\mathbf{A}^{\text{init}} \in \{0,1\}^{N \times N}$. Matrix $\mathbf{A}^{\text{init}}$ was then decomposed into non-negative factors $\mathbf{U}^0$ and $\mathbf{V}^0$, that is, $\mathbf{A}^{\text{init}} \approx \mathbf{U}^0 (\mathbf{V}^0)^\top$, using standard off-the-shelf NMF tools. Anomalies were artificially induced by reconstructing a modified adjacency matrix $\mathbf{A}^0 = (\mathbf{U}^0 + \mathbf{O}^0)(\mathbf{V}^0)^\top$, where the only non-zero rows of $\mathbf{O}^0$ are indexed by $\{0, 25, 30, 80\}$. Figure 2(b) depicts a heatmap of $\mathbf{A}^0$, clearly showing anomalous nodes as unusually highly connected.
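The SBM sampling step described above can be sketched as follows. Several details are assumptions made only for illustration: communities of equal size, an undirected graph without self-loops, and off-diagonal block probabilities of 0.1 between adjacent communities and 0 elsewhere (one layout compatible with "decreasing away from the diagonal"). The NMF and anomaly-injection steps are not shown, and `sample_sbm` is a hypothetical name.

```python
import numpy as np

def sample_sbm(n_nodes=100, n_comm=5, p_in=0.8, p_adj=0.1, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic blockmodel.

    Assumed block matrix: p_in on the diagonal, p_adj between adjacent
    communities, 0 elsewhere. Returns (A, labels, delta).
    """
    assert n_nodes % n_comm == 0, "sketch assumes equal-size communities"
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_comm), n_nodes // n_comm)

    # Block probability matrix delta (n_comm x n_comm).
    delta = np.zeros((n_comm, n_comm))
    idx = np.arange(n_comm - 1)
    delta[idx, idx + 1] = p_adj
    delta[idx + 1, idx] = p_adj
    np.fill_diagonal(delta, p_in)

    # Edge probability for every node pair, then one Bernoulli draw per pair.
    probs = delta[labels][:, labels]
    upper = np.triu(rng.random((n_nodes, n_nodes)) < probs, k=1)
    A = (upper | upper.T).astype(int)   # symmetrize, zero diagonal
    return A, labels, delta
```

Under these assumptions, edges inside a community appear far more often than edges across communities, which is the block structure that Figure 2 is described as showing.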

In order to generate slowly evolving networks, four piecewise constant edge-variation functions were adopted: i) $f_1(t) = H(t)$; ii) $f_2(t) = H(t - 50)$; iii) $f_3(t) = 1 - H(t - 50)$; and iv) $f_4(t) = H(t) - H(t - 25) + H(t - 50) - H(t - 75)$, where $H(t)$ denotes the unit step function, and $t = 1, \ldots, 100$. The time-series $\{\mathbf{A}^t\}_{t=1}^{100}$ was generated by setting the edge indicators to $a_{ij}^t = a_{ij}^0 f_\kappa(t)$, with $\kappa$ uniformly selected at random from $\{1, 2, 3, 4\}$.
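A short NumPy sketch of this edge-variation scheme is given below. It assumes the convention $H(0) = 1$ and that $\kappa$ is drawn independently for every entry $(i, j)$; neither detail is spelled out in the text, and the function names are hypothetical.

```python
import numpy as np

def unit_step(x):
    """Unit step H(x), taking H(0) = 1 (a convention assumed here)."""
    return (np.asarray(x, dtype=float) >= 0).astype(float)

def edge_variation(t):
    """Return [f_1(t), f_2(t), f_3(t), f_4(t)] for a scalar time index t."""
    H = unit_step
    return np.array([
        H(t),
        H(t - 50),
        1.0 - H(t - 50),
        H(t) - H(t - 25) + H(t - 50) - H(t - 75),
    ])

def generate_series(A0, T=100, seed=0):
    """Build {A^t}, t = 1..T, with a^t_ij = a^0_ij * f_kappa(t).

    kappa is drawn once per entry (0-based here) and kept fixed over time.
    Returns (series of shape (T, N, N), kappa).
    """
    rng = np.random.default_rng(seed)
    kappa = rng.integers(0, 4, size=A0.shape)
    series = np.stack([A0 * edge_variation(t)[kappa] for t in range(1, T + 1)])
    return series, kappa
```

With these functions, an edge governed by $f_4$ is present for $t < 25$, absent for $25 \le t < 50$, present again for $50 \le t < 75$, and absent afterwards, so the network changes only at $t = 25$, $50$, and $75$.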


Fig. 3: Stacked plots depicting overlapping communities detected over a selected sample of time intervals. Horizontal axes are indexed by nodes, while vertical axes depict community affiliation strengths that are proportional to the relative dominance of each color per node. As expected, most nodes exhibit a strong affiliation with one of the five communities.



Fig. 2: a) SBM matrix for community generation with parameters $\delta_{ij} \in \{0, 0.1, 0.8\}$, decreasing away from the diagonal; b) Heatmap of the initial network with anomalous nodes at rows $\{0, 25, 30, 80\}$.

Numerical results. Algorithm 1 was initialized by setting $\mathbf{O}$ to an all-zero matrix, while $\mathbf{U}$ and $\mathbf{V}$ were initialized to NMF factors of $\mathbf{A}^0$ computed in batch. With $K = 5$ and $\beta = 0.97$, Algorithm 1 was run to track the constituent communities and anomalies. Selection of $\lambda_t$ and $\mu_t$ is admittedly challenging under dynamic settings where data are sequentially acquired. Nevertheless, heuristics such as increasing $\mu_t$ over time work reasonably well when the unknowns vary slowly. Setting $\lambda_t = 0.05$ and $\mu_t = 0.1\sqrt{t}$ yielded remarkably good community tracking. Generally, 8 to 10 inner iterations sufficed for convergence per time step. Figure 3 depicts stacked plots of community affiliation strengths against node indices, over a selected sample of intervals. Despite the community overlaps unveiled by the algorithm, it is clear that all nodes generally exhibit strong affiliation with one of the five communities.

Fig. 4: Comparison of the relative error performance of developed algorithms with parameters set to $\lambda_0 = 0.05$, $\mu_0 = 0.1$, $\beta = 0.97$.

Hard community assignment was done by associating node $i$ with community $m$, where $m = \arg\max_{k} u_{ik}$. Figure 5 depicts visualizations of the network at $t = 10$, $40$, and $70$, with node colors reflecting identified community membership per time interval. Nodes flagged as anomalies are shown with larger node sizes and explicit labels. Initially, the set of detected anomalies includes a number of false alarms. As more sequential data are acquired, the algorithm iteratively prunes the set until the ground-truth anomalies are identified (Figure 5(c)).
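The hard-assignment rule and the reading of anomalies off the rows of the outlier matrix can be sketched in a few lines of NumPy. The tolerance `tol` used to decide that a row of $\mathbf{O}$ is non-zero is an assumption introduced here for illustration, as are the function names.

```python
import numpy as np

def hard_assign(U):
    """Assign node i to the community k maximizing u_ik (row-wise argmax)."""
    return np.argmax(U, axis=1)

def flag_anomalies(O, tol=1e-8):
    """Indices of nodes whose row in the outlier matrix O is non-zero.

    tol is a hypothetical numerical tolerance, not a value from the paper.
    """
    return np.flatnonzero(np.linalg.norm(O, axis=1) > tol)
```

For example, a node with affiliation strengths `[0.1, 0.9]` is assigned to community 1 (0-based), while nodes whose rows of `O` are all zero are never flagged.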

Further tests were conducted with Algorithms 2, 3, and 4. For the decentralized implementation, a simple connected network of 10 computing agents was adopted, as shown in Figure 8. Figure 4 plots the relative error with respect to batch solutions, resulting from running the algorithms using the synthetic network time-series. The relative error during interval $t$ is computed as

\[
\|\hat{\mathbf{U}}^t - \mathbf{U}^t_{\text{batch}}\|_F + \|\hat{\mathbf{V}}^t - \mathbf{V}^t_{\text{batch}}\|_F + \|\hat{\mathbf{O}}^t - \mathbf{O}^t_{\text{batch}}\|_F
\]

where $\mathbf{U}^t_{\text{batch}}$ is the batch solution per interval $t$. With initial batch solutions, relative errors are initially small, followed by dramatic increases upon acquisition of sequential data. As more data are acquired, tracking with premature termination leads to faster error decay than the SGD alternative, presumably because it directly incorporates all past data by recursive aggregation. Decentralized iterations yield the slowest decay, because in addition to seeking convergence to the batch solution, consensus per agent must be attained per time-interval.

(a) $t = 10$ (b) $t = 40$ (c) $t = 70$

Fig. 5: Visualization of the largest connected components with artificially-induced anomalies at $t = 10, 40, 70$. Node colors indicate detected communities, and anomalous members are depicted by larger node sizes with labels. Although no anomalies were initially detected, further data acquisition facilitated convergence to the correct set of anomalies, i.e., $\{0, 25, 30, 80\}$.

(a) $t = 10$ (b) $t = 40$ (c) $t = 70$

Fig. 6: Demonstration of the distortive effect of anomalies with respect to community structure. Plots (a)-(c) demonstrate the results of running a standard outlier-agnostic algorithm, with the inaccurate conclusion that the time-series is dominated by three communities.

Fig. 7: Comparison of relative error plots resulting from running Algorithm 1 and varying $\mu_0$.

Fig. 8: A simple connected network of 10 computing agents used for the decentralized community tracking task.

TABLE I: Anomalous countries in the global trade dataset.

No.  Country                   Years as anomaly
1    German Federal Republic   1954, 1955, 1956, 1966
2    Russia                    1953, 1954, 1955, 1956
3    Austria                   1955, 1957, 1960
4    Pakistan                  1955, 1956, 1957
5    Japan                     1954, 1955, 1965, 1967
6    South Korea               1955, 1965
7    Yugoslavia                1954, 1956
8    Iraq                      1967
9    Yemen Arab Republic       1964, 1965
10   Kenya                     1966, 1968

Selection of $\mu_t$ is critical for joint identification of anomalies, as it controls the sparsity level inherent to $\mathbf{O}^t$. Although this is very challenging in time-varying settings, empirical investigation was used to guide selection of the “best” initial parameter $\mu_0$. Figure 7 plots the relative error (with respect to batch solutions) resulting from running Algorithm 1 for several values of $\mu_0$, with $\beta = 0.97$, and $\lambda_t = \lambda_0 = 0.05$ for all $t$. As seen from the plots, setting $\mu_0 = 0.10$ led to the fastest convergence to the batch solutions.

B. Real Data 

Dataset description. The developed algorithms were tested 
on a time-series of real-world networks extracted from global 
trade flow statistics. Extracted under the Correlates of War 
project [3], the dataset captures information on annual bilateral 
trade flows (imports and exports) among countries between 
1870 and 2009. The network time-series were indexed by trade 
years, and each node was representative of a country, while 
directed and weighted edges were indicative of the volume and 
direction (export/import) of trade between countries measured 
in present-day U.S. dollars. 

Since trade volumes between countries can vary by orders of 
magnitude, edge weights were set to logarithms of the recorded 
trade flows. It is also important to note that some countries did 
not exist until 50 or fewer years ago. As a result, network 
dynamics in the dataset were due to arrival and obsolescence 
of some nodes, in addition to annual changes in edges and their 
weights. Since this paper assumes that a fixed set of nodes is 
available, the tracking algorithm was run for data ranging from 
1949 to 2009 (i.e., T = 60), with N = 170 countries. 

The objective of this experiment was to track the evolution of communities within the global trade network, and to identify any anomalies. Note that communities in the world trade network may be interpreted as regional trading consortia. Algorithm 3 was run with $C = 7$ communities, $\beta = 0.97$, $\alpha_u = \alpha_v = 0.002$, $\lambda_t = 0.5$, and $\mu_t = 1.0$, for all $t = 1, \ldots, T$. Initial values $\mathbf{U}^0$ and $\mathbf{V}^0$ were obtained by traditional NMF on $\mathbf{A}^0$, and $\mathbf{O}^0$ was set to an all-zero $N \times C$ matrix.

Numerical results. Running Algorithm 3 on this dataset revealed interesting insights about the evolution of global trade within the last sixty years. Figure 9 depicts stacked plots of countries and the communities they belonged to over a subset of years within the observation period. The horizontal axes are indexed by countries, and each community is depicted by a specific color. It is clear that over the years, more countries cultivated stronger affiliations within different communities. This observation suggests an increasing trend of globalization, with more countries actively engaging in significant trade relationships within different trade communities. Between 1949 and 1963, global trade was dominated by one major community, while the other communities played a less significant role. Based on historical accounts, it is likely that such trade dynamics were related to ongoing global recovery in the years following the Second World War.

Fig. 9: Overlapping communities in world trade flows dataset. The bottom row of plots suggests a growing trend of globalization, with most countries participating in several trade communities.

Figure 10 depicts visualizations of the global trade network for the years 1959 and 1990, with countries color-coded according to the community with which they are most strongly affiliated. A core community of economic powerhouses (in green) comprises global leaders such as the United States, United Kingdom, Canada, and France, as seen from the 1959 visualization. Interestingly, this core group of countries remains intact as a community in 1990. It turns out that geographical proximity and language play an important role in trade relationships. This is evident from two communities (colored maroon and yellow) which are dominated by South and Central American nations, with Spain and Portugal as exceptions that have strong cultural and language influences on these regions.

Another interesting observation from the 1990 visualization 
is the large community of developing nations (in blue). Most 
of these nations only existed as colonies in 1959, and are not 
depicted in the first drawing. However, 1990 lies within their 
post-colonial period, during which they presumably started 
establishing strong trading ties with one another. 

Finally, Table I lists countries that were flagged by Algorithm 3 as anomalies. The table shows each of these countries, along with a list of years during which they were identified as anomalies. In the context of global trade, anomalies are expected to indicate abnormal or irregular trading patterns. Interestingly, the German Federal Republic, Russia, and Japan were some of the countries most adversely affected by the Second World War, and their trade patterns during a period of rapid economic revival may corroborate identification as anomalies. It is also possible that South Korea was flagged in 1955 and 1965 because of its miraculous economic growth that started in the 1950s.

(a) 1959 (b) 1990

Fig. 10: Communities identified in the global trade network for the years 1959 and 1990.

VII. Conclusion 

This paper put forth a novel approach for jointly tracking communities in time-varying network settings, and identifying anomalous members. Leveraging advances in overlapping community discovery, a temporal outlier-aware edge generation model was proposed. It was assumed that the anomaly matrix is sparse, the outlier-free, noiseless factor model is low rank, and the network evolves slowly. Based on these assumptions, a sparsity-promoting, rank-regularized EWLSE was advocated to jointly track communities and identify anomalous nodes. A first-order sequential tracking algorithm was developed, based on alternating minimization and recent advances in accelerated proximal-splitting optimization.

Motivated by contemporary needs for processing big data, often in streaming and distributed settings, a number of algorithmic improvements were put forth. Real-time operation was attained by developing a fast online tracking algorithm, based on stochastic gradient iterations. For settings involving distributed acquisition and storage of network data, a decentralized tracking algorithm that capitalizes on the separability inherent to ADMM iterations was developed.

Simulations on synthetic SBM networks successfully unveiled the underlying communities, and flagged artificially-induced anomalies. Experiments conducted on a sequence of networks extracted from historical global trade flows between nations revealed interesting results concerning globalization of trade, and unusual trading behavior exhibited by certain countries during the early post-world war era.

Appendix A 

Derivation of decentralized updates 

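Recalling the constraint $\mathbf{X}_{nm} = \bar{\mathbf{X}}_{nm}$, the per-iteration update in (30c) becomes

\begin{align}
\mathbf{X}_{nm}[k+1] = \arg\min_{\mathbf{X}_{nm}} \; & \frac{\rho}{2} \left[ \|\mathbf{V}_m[k+1] - \mathbf{X}_{nm}\|_F^2 + \|\mathbf{V}_n[k+1] - \mathbf{X}_{nm}\|_F^2 \right] \nonumber \\
& - \mathrm{Tr}\left\{ \left( \boldsymbol{\Pi}_{nm}[k+1] + \bar{\boldsymbol{\Pi}}_{nm}[k+1] \right)^\top \mathbf{X}_{nm} \right\} \tag{36}
\end{align}

whose solution turns out to be

\[
\mathbf{X}_{nm}[k+1] = \frac{1}{2\rho} \left( \boldsymbol{\Pi}_{nm}[k+1] + \bar{\boldsymbol{\Pi}}_{nm}[k+1] \right) + \frac{1}{2} \left( \mathbf{V}_m[k+1] + \mathbf{V}_n[k+1] \right). \tag{37}
\]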
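As a quick check on (37), the minimizer follows from the first-order optimality condition of (36). The short derivation below is a sketch that reads the cost in (36) as $\frac{\rho}{2}\left[\|\mathbf{V}_m[k+1] - \mathbf{X}_{nm}\|_F^2 + \|\mathbf{V}_n[k+1] - \mathbf{X}_{nm}\|_F^2\right] - \mathrm{Tr}\{(\boldsymbol{\Pi}_{nm}[k+1] + \bar{\boldsymbol{\Pi}}_{nm}[k+1])^\top \mathbf{X}_{nm}\}$; time indices $[k+1]$ are dropped on the right-hand sides for brevity.

```latex
\begin{align*}
\mathbf{0} &= \rho\,(\mathbf{X}_{nm} - \mathbf{V}_m) + \rho\,(\mathbf{X}_{nm} - \mathbf{V}_n)
             - \left(\boldsymbol{\Pi}_{nm} + \bar{\boldsymbol{\Pi}}_{nm}\right) \\
\Longrightarrow \quad 2\rho\,\mathbf{X}_{nm} &= \rho\,(\mathbf{V}_m + \mathbf{V}_n)
             + \boldsymbol{\Pi}_{nm} + \bar{\boldsymbol{\Pi}}_{nm} \\
\Longrightarrow \quad \mathbf{X}_{nm} &= \frac{1}{2\rho}\left(\boldsymbol{\Pi}_{nm} + \bar{\boldsymbol{\Pi}}_{nm}\right)
             + \frac{1}{2}\left(\mathbf{V}_m + \mathbf{V}_n\right).
\end{align*}
```

Since the cost is a strictly convex quadratic in $\mathbf{X}_{nm}$, this stationary point is the unique minimizer.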

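Assuming that $\boldsymbol{\Pi}_{nm}[0] = -\bar{\boldsymbol{\Pi}}_{nm}[0]$, then it can be shown by a simple inductive argument that $\boldsymbol{\Pi}_{nm}[k] = -\bar{\boldsymbol{\Pi}}_{nm}[k]$, $n \in \mathcal{N}_m$ [21]. Consequently,

\[
\mathbf{X}_{nm}[k] = \frac{1}{2} \left( \mathbf{V}_m[k] + \mathbf{V}_n[k] \right) \tag{38}
\]

and

\[
\boldsymbol{\Pi}_{nm}[k+1] = \boldsymbol{\Pi}_{nm}[k] + \frac{\rho}{2} \left( \mathbf{V}_m[k] - \mathbf{V}_n[k] \right). \tag{39}
\]

Due to (38), note that $\mathbf{X}_{nm}[k] = \mathbf{X}_{mn}[k]$, and that if $\bar{\boldsymbol{\Pi}}_{nm}[0] = \boldsymbol{\Pi}_{mn}[0]$, then $\bar{\boldsymbol{\Pi}}_{nm}[k] = \boldsymbol{\Pi}_{mn}[k]$, $n \in \mathcal{N}_m$.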

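Focusing on the update for $\mathbf{V}_m$ in (30a), and dropping the irrelevant terms yields

\begin{align}
\arg\min_{\mathbf{V}_m} \; & \Phi_{\lambda_t}\!\left(\mathbf{U}_m[k], \mathbf{O}_m[k], \mathbf{V}_m, \mathbf{S}_m^t, s^t\right) \nonumber \\
& + \mathrm{Tr}\Big\{ \sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}^\top[k+1] \left( \mathbf{V}_m - \mathbf{X}_{nm}[k] \right) \Big\} \nonumber \\
& + \frac{\rho}{2} \sum_{n \in \mathcal{N}_m} \|\mathbf{V}_m - \mathbf{X}_{nm}[k]\|_F^2. \tag{40}
\end{align}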

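Eliminating constants in the second term, one obtains

\[
\mathrm{Tr}\Big\{ \sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}^\top[k+1] \mathbf{V}_m \Big\} = \mathrm{Tr}\left\{ \mathbf{V}_m^\top \boldsymbol{\Pi}_m[k+1] \right\} \tag{41}
\]

where $\boldsymbol{\Pi}_m[k+1] := \sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}[k+1]$. Using (39),

\[
\sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}[k+1] = \sum_{n \in \mathcal{N}_m} \boldsymbol{\Pi}_{nm}[k] + \frac{\rho}{2} \Big( |\mathcal{N}_m| \mathbf{V}_m[k] - \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big) \tag{42}
\]

leading to the dual variable update

\[
\boldsymbol{\Pi}_m[k+1] = \boldsymbol{\Pi}_m[k] + \frac{\rho}{2} \Big( |\mathcal{N}_m| \mathbf{V}_m[k] - \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big). \tag{43}
\]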
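The step from the per-neighbor recursion (39) to the aggregated recursion (43) can be checked numerically. The sketch below implements both updates for a single agent and a list of neighbor iterates; the function names and the list-based interface are assumptions made for illustration.

```python
import numpy as np

def update_pi_per_edge(Pi_nm_list, V_m, V_neighbors, rho):
    """Per-neighbor dual update (39): Pi_nm + (rho/2) (V_m - V_n)."""
    return [Pi + 0.5 * rho * (V_m - V_n)
            for Pi, V_n in zip(Pi_nm_list, V_neighbors)]

def update_pi_aggregate(Pi_m, V_m, V_neighbors, rho):
    """Aggregated dual update (43): Pi_m + (rho/2) (|N_m| V_m - sum_n V_n)."""
    n_neighbors = len(V_neighbors)
    return Pi_m + 0.5 * rho * (n_neighbors * V_m - sum(V_neighbors))
```

Summing the outputs of `update_pi_per_edge` reproduces `update_pi_aggregate` applied to the sum of the inputs, which is why each agent only needs to store one aggregated multiplier rather than one per neighbor.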

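Expansion of the third term in (40) leads, up to terms that do not depend on $\mathbf{V}_m$, to

\[
\frac{\rho}{2} \sum_{n \in \mathcal{N}_m} \|\mathbf{V}_m - \mathbf{X}_{nm}[k]\|_F^2 = \frac{\rho}{2} \mathrm{Tr}\Big\{ |\mathcal{N}_m| \mathbf{V}_m^\top \mathbf{V}_m - 2 \mathbf{V}_m^\top \sum_{n \in \mathcal{N}_m} \mathbf{X}_{nm}[k] \Big\} \tag{44}
\]

and (38) can be used to further expand $\sum_{n \in \mathcal{N}_m} \mathbf{X}_{nm}[k]$ as follows

\[
\sum_{n \in \mathcal{N}_m} \mathbf{X}_{nm}[k] = \frac{|\mathcal{N}_m|}{2} \mathbf{V}_m[k] + \frac{1}{2} \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k]. \tag{45}
\]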

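Upon substituting (45) into (44), it turns out that

\[
\frac{\rho}{2} \sum_{n \in \mathcal{N}_m} \|\mathbf{V}_m - \mathbf{X}_{nm}[k]\|_F^2 = \frac{\rho}{2} \mathrm{Tr}\Big\{ |\mathcal{N}_m| \mathbf{V}_m^\top \mathbf{V}_m - \mathbf{V}_m^\top \Big( |\mathcal{N}_m| \mathbf{V}_m[k] + \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big) \Big\} \tag{46}
\]

again up to terms that do not depend on $\mathbf{V}_m$, and $\mathbf{V}_m[k+1]$ can be obtained by solving

\begin{align}
\mathbf{V}_m[k+1] = \arg\min_{\mathbf{V}_m} \; & \Phi_{\lambda_t}\!\left(\mathbf{U}_m[k], \mathbf{O}_m[k], \mathbf{V}_m, \mathbf{S}_m^t, s^t\right) \nonumber \\
& + \mathrm{Tr}\Big[ \tfrac{\rho}{2} |\mathcal{N}_m| \mathbf{V}_m^\top \mathbf{V}_m + \mathbf{V}_m^\top \Big( \boldsymbol{\Pi}_m[k+1] - \tfrac{\rho}{2} \Big\{ |\mathcal{N}_m| \mathbf{V}_m[k] + \sum_{n \in \mathcal{N}_m} \mathbf{V}_n[k] \Big\} \Big) \Big] \tag{47}
\end{align}

whose closed-form solution is readily available. The remaining updates for $\mathbf{U}_m[k+1]$, $\mathbf{O}_m[k+1]$, and $\mathbf{P}_m[k+1]$ follow in a straightforward manner from the ADMM primal variable updates, and they are all available in closed form. Note that solving for $\mathbf{P}_m[k+1]$ entails completion of squares, resulting in a standard Lasso problem whose per-entry solutions are available in closed form using the soft thresholding operator.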